# Traceability in the Wild: Automatically Augmenting Incomplete Trace Links

* There is one zip file for every project
* The structure for each project is identical and described in the section below

# JSON Export format description
 
## General
* Each project is exported to its own folder (which is zipped)
* This folder contains
    * a subfolder for each **issue type**
    * a set of files describing artifact links, namely ```_issue_links.json```, ```_issue_to_change_set.json```, and ```_change_set_to_code.json```
* The **issue type** folders contain one file per issue. The name of the files is the **unique** key of the issue, e.g. ```DERBY-263.json```
* Each file uses JSON to describe data
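
The layout above can be traversed with a few lines of Python. The following is a minimal sketch, assuming the project zip has already been extracted to a local folder; the function name `load_issues` is hypothetical and not part of the dataset.

```python
import json
from pathlib import Path


def load_issues(project_dir):
    """Return {issue_key: issue_dict} for all issue files of one project.

    Assumes every subfolder of project_dir is an issue-type folder and
    every *.json file inside it describes exactly one issue.
    """
    issues = {}
    for type_dir in sorted(Path(project_dir).iterdir()):
        if not type_dir.is_dir():
            continue  # skip the top-level link files such as _issue_links.json
        for issue_file in sorted(type_dir.glob("*.json")):
            with issue_file.open(encoding="utf-8") as fh:
                # The file name without ".json" is the unique issue key.
                issues[issue_file.stem] = json.load(fh)
    return issues
```

Calling `load_issues("/path/to/extracted/project")` would then yield one dictionary entry per issue, keyed by the file name without its extension.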

## File formats

### File: \<Issue-Id\>.json
* Key-value pairs describing an issue
* *id* is the unique identifier of an issue
* Created and resolved dates are both in ISO 8601 format, timezone UTC
* Example
    ```json
    {
      "priority": "Major",
      "resolved_date": "2006-02-13T16:58:58Z",
      "resolution": "Done",
      "type": "Task",
      "id": "JBRULES-16",
      "status": "Closed",
      "created_date": "2006-02-13T16:57:00Z",
      "description": "None",
      "summary": "[JBRULES-16] move PropagationContextImpl to common"
    }
    ```
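
As an illustration of working with the date fields, the sketch below parses the two timestamps of the example issue and computes the time between creation and resolution. It assumes both fields are present and use the `Z`-suffixed form shown above; the helper name `parse_utc` is hypothetical.

```python
import json
from datetime import datetime, timezone


def parse_utc(timestamp):
    """Parse a timestamp such as '2006-02-13T16:58:58Z' into an aware UTC datetime."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


# Subset of the example issue shown above.
issue = json.loads("""
{
  "id": "JBRULES-16",
  "created_date": "2006-02-13T16:57:00Z",
  "resolved_date": "2006-02-13T16:58:58Z"
}
""")

time_to_resolve = parse_utc(issue["resolved_date"]) - parse_utc(issue["created_date"])
print(time_to_resolve.total_seconds())  # 118.0
```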
    
### File: _issue_links.json
* List of entries
* Each entry describes a directed link from one issue (source issue) to another issue (target issue)
* Each link is named: the *semantics* field states the kind of relationship, e.g. "Related"
* Issues are identified by their ids
* Example:
    ```json
    [
      {
        "source_issue_id": "DROOLS-1128",
        "target_issue_id": "DROOLS-1114",
        "semantics": "Related"
      },
      {
        "source_issue_id": "DROOLS-1125",
        "target_issue_id": "DROOLS-1127",
        "semantics": "Related"
      }
    ]
    ```
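
Because the links are directed, it can be convenient to index them by source issue. The following sketch shows one possible way to do so using the example entries above; `build_link_index` is a hypothetical helper name.

```python
import json
from collections import defaultdict

# The two example entries shown above.
links_json = """
[
  {"source_issue_id": "DROOLS-1128", "target_issue_id": "DROOLS-1114", "semantics": "Related"},
  {"source_issue_id": "DROOLS-1125", "target_issue_id": "DROOLS-1127", "semantics": "Related"}
]
"""


def build_link_index(links):
    """Map each source issue id to a list of (target issue id, semantics) pairs."""
    outgoing = defaultdict(list)
    for link in links:
        outgoing[link["source_issue_id"]].append(
            (link["target_issue_id"], link["semantics"])
        )
    return dict(outgoing)


index = build_link_index(json.loads(links_json))
print(index["DROOLS-1128"])  # [('DROOLS-1114', 'Related')]
```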

### File: _issue_to_change_set.json
* Mapping from issues to change sets of the version control system (git)
* One issue may have multiple change sets
* Each link
    * Identifies the change set by its *unique* commit hash
    * Has a _certainty_, a value ranging from 0.0 to 1.0. A value of 1.0 describes a fully trusted link, i.e. one directly mined from the original data. Lower values indicate artificially added (augmented) links
* Example:
    ```json
    {
      "DERBY-1802": [
        {
          "commit_hash": "ce0b604847d0ca2ab7b2103e24e6f278077c97f0",
          "certainty": 1.0
        },
        {
          "commit_hash": "7a2df4c6fa2ef5c6e60c9b3212a799dcb79be4bc",
          "certainty": 1.0
        }
      ],
      "DERBY-1556": [
        {
          "commit_hash": "f061ecd1a1096ce0b8e3ccf28e3da7cfcae371d5",
          "certainty": 1.0
        },
        {
          "commit_hash": "9e2d42bf8031f4696dbb4e1f6d49319c53010335",
          "certainty": 1.0
        },
        {
          "commit_hash": "7d16ff3d1e605ddb26e8aadc101859855d9742b3",
          "certainty": 1.0
        }
      ]
    }
    ```
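
The _certainty_ value makes it possible to separate mined links from augmented ones. The sketch below filters the mapping by a minimum certainty; the function name `trusted_links` and the second link in the sample (all-zero hash, certainty 0.6) are hypothetical and only serve to illustrate an augmented link.

```python
def trusted_links(issue_to_change_set, min_certainty=1.0):
    """Return {issue_id: [commit_hash, ...]} keeping only links with
    certainty >= min_certainty. Issues left without links are dropped."""
    result = {}
    for issue_id, links in issue_to_change_set.items():
        kept = [link["commit_hash"] for link in links
                if link["certainty"] >= min_certainty]
        if kept:
            result[issue_id] = kept
    return result


sample = {
    "DERBY-1802": [
        # Taken from the example above: a directly mined link.
        {"commit_hash": "ce0b604847d0ca2ab7b2103e24e6f278077c97f0", "certainty": 1.0},
        # Hypothetical augmented link, invented for this illustration.
        {"commit_hash": "0" * 40, "certainty": 0.6},
    ]
}

print(len(trusted_links(sample)["DERBY-1802"]))       # 1
print(len(trusted_links(sample, 0.5)["DERBY-1802"]))  # 2
```

With the default threshold of 1.0 only the mined links remain; lowering the threshold admits augmented links as well.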

### File: _change_set_to_code.json
* Contains data about change sets
    * Commit message written by the developer
    * Timestamp (UTC) of the commit date in ISO 8601 format
    * Modified files
* Change sets are identified by their *unique* hash, files by their path in the code repository
* Example
    ```json
    {
      "0a8f8408a5e4070ec8a88f786601e601c30d3da0": {
        "message": "DERBY-5792: Make it possible to turn off encryption on an already encrypted database.\n\nSimplified code removing old container files generated during encryption and\ndecryption of a database. There were two implementations, I removed one of them\nand removed the parameter of EncryptOrDecryptData.removeOldVersionOfContainers\n(and calling methods).\n\nPatch file: derby-5792-5b-old_container_removal_cleanup.diff\n\n\ngit-svn-id: https://svn.apache.org/repos/asf/db/derby/code/trunk@1394522 13f79535-47bb-0310-9956-ffa450edef68\n",
        "committed_date": "2012-10-05T13:49:10Z",
        "file_path": [
          "java/engine/org/apache/derby/iapi/store/raw/data/DataFactory.java",
          "java/engine/org/apache/derby/impl/store/raw/RawStore.java",
          "java/engine/org/apache/derby/impl/store/raw/data/BaseDataFileFactory.java",
          "java/engine/org/apache/derby/impl/store/raw/data/EncryptOrDecryptData.java"
        ]
      },
      "f32e51e193c80b5968fca03977d7884403d7d450": {
        "message": "DERBY-615 Make RunTest install a SecurityManager when using useprocess=false.\nAdd a utility SQL test function to indicate if a SecurityManager is installed.\n\n\ngit-svn-id: https://svn.apache.org/repos/asf/db/derby/code/trunk@365776 13f79535-47bb-0310-9956-ffa450edef68\n",
        "committed_date": "2006-01-03T23:59:10Z",
        "file_path": [
          "java/testing/org/apache/derbyTesting/functionTests/harness/RunTest.java",
          "java/testing/org/apache/derbyTesting/functionTests/harness/jvm.java",
          "java/testing/org/apache/derbyTesting/functionTests/util/TestRoutines.java"
        ]
      }
    }
    ```
* To get the content of a file at a specific commit, the following command line can be used (the example uses the first file of the first commit shown above):
    ```commandline
    cd /path/to/derby_repo
    git show 0a8f8408a5e4070ec8a88f786601e601c30d3da0:java/engine/org/apache/derby/iapi/store/raw/data/DataFactory.java
    ```
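
The same lookup can be scripted. The sketch below combines the two mapping files to collect the files touched for an issue and wraps the `git show` call in Python. It assumes `git` is installed and a local clone of the project repository exists; all function names are hypothetical.

```python
import subprocess


def files_for_issue(issue_id, issue_to_change_set, change_set_to_code):
    """Collect the paths modified by all change sets linked to one issue."""
    paths = set()
    for link in issue_to_change_set.get(issue_id, []):
        change_set = change_set_to_code.get(link["commit_hash"], {})
        paths.update(change_set.get("file_path", []))
    return sorted(paths)


def git_show_args(commit_hash, file_path):
    """Build the argument list for `git show <hash>:<path>`."""
    return ["git", "show", f"{commit_hash}:{file_path}"]


def file_at_commit(repo_dir, commit_hash, file_path):
    """Return the content of file_path as of commit_hash.

    Runs git inside repo_dir and raises CalledProcessError if git fails,
    e.g. when the path does not exist in that commit.
    """
    result = subprocess.run(
        git_show_args(commit_hash, file_path),
        cwd=repo_dir,
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # tolerate files that are not valid UTF-8
        check=True,
    )
    return result.stdout
```

Here `issue_to_change_set` and `change_set_to_code` are the dictionaries obtained by loading ```_issue_to_change_set.json``` and ```_change_set_to_code.json``` with `json.load`.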
